Abstract. Let $G(q)$ be a finite Chevalley group, where $q$ is a power of a good prime $p$, and
let $U(q)$ be a Sylow $p$-subgroup of $G(q)$. Then a generalized version of a conjecture of Higman
asserts that the number $k(U(q))$ of conjugacy classes in $U(q)$ is given by a polynomial in $q$
with integer coefficients. In [12], the first and the third authors developed an algorithm to
calculate the values of $k(U(q))$. By implementing it in a computer program using GAP,
they were able to calculate $k(U(q))$ for $G$ of rank at most 5, thereby proving that for these
cases $k(U(q))$ is given by a polynomial in $q$. In this paper we present some refinements
and improvements of the algorithm that allow us to calculate the values of $k(U(q))$ for
finite Chevalley groups of rank six and seven, except $E_7$. We observe that $k(U(q))$ is a
polynomial, so that the generalized Higman conjecture holds for these groups. Moreover, if
we write $k(U(q))$ as a polynomial in $q - 1$, then the coefficients are non-negative.

Under the assumption that $k(U(q))$ is a polynomial in $q - 1$, we also give an explicit
formula for the coefficients of $k(U(q))$ of degrees zero, one and two.



1. Introduction 

Let $\mathrm{GL}_n(q)$ be the group of invertible $n \times n$ matrices with coefficients in the finite field
$\mathbb{F}_q$, where $q$ is a power of the prime $p$, and let $U_n(q)$ be the subgroup of unipotent upper
triangular matrices. A well-known conjecture attributed to Higman is that the number
$k(U_n(q))$ of $U_n(q)$-conjugacy classes in $U_n(q)$ is given by a polynomial in $q$ with integer
coefficients independent of $q$, see [14]. This conjecture has attracted the interest of many
mathematicians including Thompson [23] and Robinson [21].

Using computer calculations, Vera-López and Arregi verified in [24] that the conjecture
holds for $n \le 13$. The resulting polynomials have the additional property that, considered as
polynomials in $q - 1$, their coefficients are non-negative integers. We also note that Evseev
has calculated these polynomials via an alternative approach, see [6].
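
For very small $n$ and a prime $q = p$, the class numbers can be checked directly. The following Python sketch is not the method of [24] or [6]; it is a brute-force illustration (all function names are ours) that counts commuting pairs in $U_n(p)$ and divides by the group order, which gives the number of conjugacy classes by Burnside's lemma. It is only feasible for tiny groups.

```python
from itertools import product

def unitriangular(n, p):
    """Yield all n x n upper unitriangular matrices over F_p (p prime)."""
    positions = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for values in product(range(p), repeat=len(positions)):
        m = [[int(i == j) for j in range(n)] for i in range(n)]
        for (i, j), a in zip(positions, values):
            m[i][j] = a
        yield tuple(tuple(row) for row in m)

def mul(a, b, p):
    """Matrix product modulo p."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )

def class_number(n, p):
    """k(U_n(p)) = (number of commuting pairs) / |U_n(p)| (Burnside's lemma)."""
    group = list(unitriangular(n, p))
    commuting = sum(1 for g in group for h in group if mul(g, h, p) == mul(h, g, p))
    return commuting // len(group)

print(class_number(3, 2), class_number(3, 3), class_number(4, 2))
```

For $n = 3$ the values agree with $q^2 + q - 1$ at $q = 2, 3$, and for $n = 4$, $q = 2$ with $2q^3 + q^2 - 2q$.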

Since the conjugacy classes of a finite group are in bijective correspondence with its complex
irreducible characters, one can also approach the conjecture via character theory. This
has been considered among others by André [2], Isaacs [18] and Lehrer [20].

In this paper, we consider a generalization of Higman's conjecture. Let $G(q)$ be a finite
Chevalley group, i.e. the group of $\mathbb{F}_q$-rational points of a simple algebraic group $G$ which is
defined and split over $\mathbb{F}_q$. Assume that $p$ is good for $G(q)$ and let $U(q)$ be a Sylow $p$-subgroup
of $G(q)$. Then the generalized conjecture claims that the number $k(U(q))$ of $U(q)$-conjugacy
classes is given by a polynomial in $q$, and as a polynomial in $q - 1$ it has non-negative integer
coefficients.

There has been a lot of interest recently in the conjugacy classes and the complex characters
of $U(q)$, some of which gives evidence for (the generalized) Higman's conjecture. For example,
in [1], Alperin proved that the number $k(U_n(q), \mathrm{GL}_n(q))$ of $U_n(q)$-conjugacy classes in $\mathrm{GL}_n(q)$
is given by a polynomial in $q$ with integer coefficients. This was generalized by the first and
the third authors in [11] where they showed that $k(U(q), G(q))$ is a polynomial in $q$ when
the centre of $G$ is connected and $G$ is not of type $E_8$. In case $G$ is of type $E_8$, the number
$k(U(q), G(q))$ is given by one of two polynomials, depending on $q$ mod 3. It is conceivable
that similar PORC (Polynomial On Residue Classes) behaviour occurs for $k(U(q))$ if $G$ is
of type $E_8$. For other recent developments, see for example [13], [15], [16] and [19].

An algorithm was introduced in [12] to calculate a parameterization of the conjugacy
classes of $U(q)$, and thus determine $k(U(q))$. In this paper we describe an improved version
of this algorithm. Both versions of the program have been implemented in GAP [7]. The
main idea of the algorithm (which is based on the results in [8]) is to replace the task of
counting conjugacy classes by the geometric task of counting $\mathbb{F}_q$-rational points of quasi-affine
varieties over finite fields, which parametrize the conjugacy classes of $U(q)$. The goal of the
program is to determine these varieties and then calculate the number of $\mathbb{F}_q$-rational points
from the polynomial equations which define them. In the previous version of the algorithm,
it was necessary to inspect some output of the program and complete some calculations by
hand, and it was only possible to calculate $k(U(q))$ when the rank of $G$ is at most 5.

The improved version of the algorithm determines the polynomial equations much more 
effectively, and is significantly better at calculating the number of rational points in the 
varieties directly. With the aid of the improved version, we are able to prove the following 
theorem. 

Theorem 1.1. Let $G$ be a split simple algebraic group defined over $\mathbb{F}_q$ of rank at most 7,
excluding $E_7$, where $q$ is a power of a good prime $p$. Let $U$ be a maximal unipotent subgroup
of $G$ which is also defined over $\mathbb{F}_q$. Then the number $k(U(q))$ of $U(q)$-conjugacy classes in
$U(q)$ is given by a polynomial in $q$ with integer coefficients. Furthermore, if one considers
$k(U(q))$ as a polynomial in $q - 1$, then the coefficients are non-negative.

The theoretical background underlying our algorithm only holds for good primes. Therefore,
the values for $k(U(q))$ given in this paper are only valid in this case. In [4], Bradley and
the first author calculated k(U(q)) for q a power of a bad prime and G of rank at most 4, 
excluding In these cases k(U(q)) is again given by polynomials with integer coefficients 
which are expectedly not the same as those for good p. 

As mentioned in [12], it is straightforward to adapt the program to calculate the number
of $M(q)$-classes in $N_1(q)/N_2(q)$, where $M, N_1, N_2$ are normal unipotent subgroups of a
Borel subgroup of $G$ defined over $\mathbb{F}_q$. For example, it is possible to calculate the number
$k(U(q), U^{(l)}(q))$ of $U(q)$-conjugacy classes in the $l$-th term of the descending central series of
$U(q)$ for $l \in \mathbb{N}$. We have done this for some cases where $G$ is of type $E_7$ and $E_8$. Here we
also found that all values which we calculated were given by polynomials in $q$ with integer
coefficients. While the previous version of the algorithm could calculate $k(U(q), U^{(l)}(q))$ for
$G$ of type $E_8$ only for $l \ge 10$, the new version is able to compute all values for $l \ge 7$.

Our program is based on the algorithm outlined in [8] and also uses ideas by Bürgstein
and Hesselink [5] as well as Vera-López and Arregi [24].

We now give a brief outline of the structure of the paper. In Section 2 we give a summary 
of the theoretical results that the program is based on. The algorithm, with an emphasis
on the improvements that have been made, is described in Section 3 and the results of our
calculations are presented in Section 4. Finally in Section 5, we prove explicit formulas for the
coefficients of $k(U(q))$ of degrees zero, one and two, assuming that $k(U(q))$ is a polynomial
in $q - 1$.

2. Theoretical background 

Let $G$ be a connected reductive algebraic group, defined and split over the finite field $\mathbb{F}_q$
with $q$ elements, where $q$ is a power of a prime $p$. Assume that $p$ is a good prime for $G$ and
let $K$ be the algebraic closure of $\mathbb{F}_q$. We identify $G$ with its group of points over $K$ and
write $G(q)$ for the group of $\mathbb{F}_q$-rational points of $G$. Let $B$ be a Borel subgroup of $G$ defined
over $\mathbb{F}_q$, containing a maximal torus $T$ defined over $\mathbb{F}_q$. The unipotent radical $U$ of $B$ is also
defined over $\mathbb{F}_q$ and the group $U(q)$ of $\mathbb{F}_q$-rational points of $U$ is a Sylow $p$-subgroup of $G(q)$.
We write $\mathfrak{u}$ for the Lie algebra of $U$ and $\mathfrak{u}(q)$ for its space of $\mathbb{F}_q$-rational points.

Below we recall some of the results from [8] and [10] on which our algorithm for calculating
$k(U(q))$ is based. We note that [9, Thm. 1.1] implies that some of the results in [8] hold in
greater generality than stated there and allow ourselves to give the more general statements
below. Thanks to [8, Prop. 6.2], we know that the conjugacy classes of $U(q)$ are in bijective
correspondence with the adjoint $U(q)$-orbits in $\mathfrak{u}(q)$, so we are henceforth primarily concerned
with these orbits.

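Let $\Phi$ be the root system determined by $G$ and $T$ and let $\Phi^+$ be the set of positive roots
determined by $B$. Denote by $\preceq$ the partial order on $\Phi$ determined by $\Phi^+$. Let $N$ be the
cardinality of $\Phi^+$ and fix an enumeration of $\Phi^+ = \{\beta_1, \dots, \beta_N\}$ such that $i \le j$ whenever
$\beta_i \preceq \beta_j$. Let $\mathfrak{g}_\beta$ be the root subspace of $\mathfrak{g}$ for $\beta \in \Phi$, and fix a Chevalley basis
$\{e_\beta \mid \beta \in \Phi^+\}$ for $\mathfrak{u}$ with $e_\beta \in \mathfrak{u}(q)$ for each $\beta \in \Phi^+$. For $0 \le i \le N$, we define

\[ \mathfrak{m}_i = \bigoplus_{j=i+1}^{N} \mathfrak{g}_{\beta_j}. \]

For $x \in \mathfrak{u}$, denote by $x + K e_{\beta_i} + \mathfrak{m}_i$ the coset $\{x + a_i e_{\beta_i} + \mathfrak{m}_i \mid a_i \in K\}$ in $\mathfrak{u}/\mathfrak{m}_i$. We have
the following dichotomy given by [8, Lem. 5.1]:

(I) Either all elements of $x + K e_{\beta_i} + \mathfrak{m}_i$ are conjugate in $\mathfrak{u}/\mathfrak{m}_i$ by $U$ (in which case we
call $i$ an inert point of $x$), or
(R) no two elements of $x + K e_{\beta_i} + \mathfrak{m}_i$ are conjugate in $\mathfrak{u}/\mathfrak{m}_i$ by $U$ (in which case we call
$i$ a ramification point of $x$).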

An element $x = \sum_{i=1}^{N} a_i e_{\beta_i} \in \mathfrak{u}$ is said to be the minimal representative of its $U$-orbit if
$a_i = 0$ whenever $i$ is an inert point of $x$. It follows from [8, Prop. 5.4 and Lem. 5.5] that
each $U$-orbit in $\mathfrak{u}$ contains a unique minimal representative. For our algorithm we use the
following characterization of minimal representatives, which follows immediately from the
discussion above.

Lemma 2.1. Let $x = \sum_{j=1}^{N} a_j e_{\beta_j} \in \mathfrak{u}$. Then $x$ is the minimal representative of its $U$-orbit
if $a_i = 0$ whenever $\dim \mathfrak{c}_{\mathfrak{u}}(x + \mathfrak{m}_i) = \dim \mathfrak{c}_{\mathfrak{u}}(x + \mathfrak{m}_{i-1}) - 1$.

A crucial theoretical result is that one can view the minimal representatives as elements
of certain quasi-affine varieties. Let $c \in \{I, R\}^N$ and define

\[ \mathfrak{u}_c = \{x \in \mathfrak{u} \mid i \text{ is an inert point of } x \text{ if and only if } c_i = I\} \]

and

\[ X_c = \Big\{ \sum_{i=1}^{N} a_i e_{\beta_i} \in \mathfrak{u}_c \;\Big|\; a_i = 0 \text{ if } c_i = I \Big\}. \]

Then, as stated in [10, Lem. 4.2], we have that $X_c$ is a locally closed subvariety of $\mathfrak{u}$, and the
adjoint $U$-orbits in $\mathfrak{u}$ are in bijection with the points of $\bigcup_{c \in \{I,R\}^N} X_c$. Moreover, this implies
that the $U(q)$-orbits in $\mathfrak{u}(q)$ are parameterized by the $X_c(q) = X_c \cap \mathfrak{u}(q)$.

In the next section we describe our algorithm for calculating k(U(q)). The idea of the 
algorithm is to determine a decomposition of the varieties X c as a disjoint union of locally 
closed subvarieties, where these subvarieties are given by the vanishing and non-vanishing of 
explicitly determined polynomials. The key step in calculating these polynomials is to use 
the characterization of minimal representatives given by Lemma 2.1. The algorithm then 
proceeds to determine the number of $\mathbb{F}_q$-rational points in these subvarieties.



3. The algorithm 

In this section we describe our algorithm for calculating a parameterization of the adjoint
$U$-orbits in $\mathfrak{u}$ and then determining $k(U(q))$. In particular, we outline the improvements to
the algorithm from [12] that have enabled us to calculate $k(U(q))$ for Chevalley groups of
rank six and seven. The proof given in [12, §3] that the algorithm correctly determines all
minimal representatives of $U(q)$-orbits in $\mathfrak{u}(q)$ remains valid here, so we only concentrate on
explaining the algorithm.

Before going into some details, we give an overview of how the algorithm works. As
mentioned in the previous section, it is based on explicitly determining the varieties $X_c$ for
$c \in \{I, R\}^N$.

In fact, first we want to generalize our notation. Let $c \in \{I, R_0, R_n\}^i$ for some $i = 1, \dots, N$.
Then we define

\[ \mathfrak{u}_c = \{x + \mathfrak{m}_i \in \mathfrak{u}/\mathfrak{m}_i \mid j \text{ is an inert point of } x \text{ if and only if } c_j = I\} \]

and

\[ X_c = \Big\{ \sum_{j=1}^{i} a_j e_{\beta_j} + \mathfrak{m}_i \in \mathfrak{u}_c \;\Big|\; a_j = 0 \text{ if and only if } c_j \in \{I, R_0\} \Big\}. \]

We define $m_c$ to be the number of $j$ with $c_j = R_n$. Denote by $1 \le k_1 < \dots < k_{m_c} \le N$ the
indices with $c_{k_j} = R_n$ and define $\beta_{c,j} = \beta_{k_j}$. Then we can write each element of $X_c$ in the
form $x_c(a) + \mathfrak{m}_i := \sum_{j=1}^{m_c} a_j e_{\beta_{c,j}} + \mathfrak{m}_i$, where $a \in K^{m_c}$. Thus we can canonically identify $X_c$
with a subvariety of $(K^\times)^{m_c}$.

Each $X_c$ can be written as a disjoint union $X_c = \bigcup_{\iota=1}^{k_c} X_c^\iota$, where

\[ X_c^\iota := \{x_c(a) + \mathfrak{m}_i \in X_c \mid f(a) = 0 \text{ for all } f \in A_c^\iota \text{ and } g(a) \ne 0 \text{ for all } g \in B_c^\iota\}, \]

and $A_c^\iota, B_c^\iota \subseteq K[t_1, \dots, t_{m_c}]$. We call the $X_c^\iota$ families of minimal representatives. Given
$A, B \subseteq K[t_1, \dots, t_{m_c}]$, we define

\[ X_{c,A,B} := \{x_c(a) + \mathfrak{m}_i \in X_c \mid f(a) = 0 \text{ for all } f \in A \text{ and } g(a) \ne 0 \text{ for all } g \in B\}. \]

Thus we have $X_c = \bigcup_{\iota=1}^{k_c} X_{c,A_c^\iota,B_c^\iota}$. Note that $X_c$ might be a single family (and often is).

The goal of the algorithm is to calculate $A_c^\iota, B_c^\iota \subseteq K[t_1, \dots, t_{m_c}]$ as above for each
$c \in \{I, R_0, R_n\}^N$. Then the $U$-orbits in $\mathfrak{u}$ are parameterized by the union of all the $X_c^\iota$ and
$k(U(q))$ is given by the sum of the $|X_c^\iota(q)|$.

During the main loop of the algorithm we are considering $c \in \{I, R_0, R_n\}^{i-1}$ and looking
at a family $X_{c,A,B}$. We also have a "stack" of other families that we will consider later: the
algorithm is a depth-first backtrack algorithm which calculates each family to the end, before
considering the next family from the stack.

The key step involves determining the variety $X'_{c,A,B}$, which consists of elements of the
form $x_c(a) + b e_{\beta_i} + \mathfrak{m}_i$ such that $x_c(a) + \mathfrak{m}_{i-1} \in X_{c,A,B}$ and $x_c(a) + b e_{\beta_i} + \mathfrak{m}_i$ is the minimal
representative of its $U$-orbit. To do this we first calculate $\dim \mathfrak{c}_{\mathfrak{u}}(x_c(a) + \mathfrak{m}_i)$ so we can
apply Lemma 2.1 to determine conditions on $a$ for whether $i$ is an inert or ramification
point of $x_c(a)$. Then the main objective is to decompose $X'_{c,A,B}$ into families. The algorithm
ends up with triples $(c^\iota, A^\iota, B^\iota)$ for $\iota = 1, \dots, k_c$, $c^\iota = (c, c_i)$ with $c_i \in \{I, R_0, R_n\}$ and
$A^\iota, B^\iota \subseteq \mathbb{Z}[t_1, \dots, t_{m_{c^\iota}}]$ with $A \subseteq A^\iota$ and $B \subseteq B^\iota$. Moreover, these triples are such that we
have a disjoint union

\[ X'_{c,A,B} = \bigcup_{\iota=1}^{k_c} X_{c^\iota, A^\iota, B^\iota}. \tag{3.1} \]

The determination of these families is sometimes easy; for example, often we have that $i$ is
an inert point for all $x_c(a)$ with $x_c(a) + \mathfrak{m}_{i-1} \in X_{c,A,B}$. However, determining $(c^\iota, A^\iota, B^\iota)$
may require some complicated analysis of polynomials. A major improvement in the level of
analysis applied here is one of the significant additions to the previous algorithm from [12].

The algorithm then proceeds by considering $(c^1, A^1, B^1)$ and adding $(c^\iota, A^\iota, B^\iota)$ for
$\iota = 2, \dots, k_c$ to the stack.

After all of the families $X_{c,A,B}$ have been processed, the number of $\mathbb{F}_q$-rational points
$|X_{c,A,B}(q)|$ is determined. These numbers are summed together to calculate $k(U(q))$.
Considerable improvements to the processes for calculating $|X_{c,A,B}(q)|$ are made in this new
algorithm. In particular, the algorithm aims to ensure that the polynomials in the sets $A$
and $B$ are linear in one indeterminate, which enables the calculation to be made.

We proceed to give a more detailed description of the algorithm. First we need to give 
some notation that is required in this explanation. 

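Let $\mathfrak{g}$ be the Lie algebra of $G$ and $\mathfrak{g}_{\mathbb{C}}$ be the Lie algebra over $\mathbb{C}$ of the same type. Fix
a Chevalley basis of $\mathfrak{g}_{\mathbb{C}}$ and denote by $\mathfrak{g}_{\mathbb{Z}}$ the $\mathbb{Z}$-lattice spanned by this Chevalley basis, so
that $\mathfrak{g} = K \otimes_{\mathbb{Z}} \mathfrak{g}_{\mathbb{Z}}$. Define

\[ \tilde{\mathfrak{u}} := \mathbb{Z}[t_1, \dots, t_m] \otimes_{\mathbb{Z}} \mathfrak{u}_{\mathbb{Z}}, \]

where $m := \max\{m_c \mid c \in \{I, R_0, R_n\}^N, X_c \ne \emptyset\}$. We allow ourselves to view $e_{\beta_i} \in \tilde{\mathfrak{u}}$, where
$e_{\beta_1}, \dots, e_{\beta_N}$ are the elements of the Chevalley basis of $\mathfrak{g}_{\mathbb{C}}$ in $\mathfrak{u}_{\mathbb{Z}}$. For $i \le N$ and
$c \in \{I, R_0, R_n\}^i$, we define

\[ x_c(t) := \sum_{j=1}^{m_c} t_j e_{\beta_{c,j}} \in \tilde{\mathfrak{u}}. \]

For $a = (a_1, \dots, a_{m_c}) \in (K^\times)^{m_c}$, we define $x_c(a) \in \mathfrak{u}$ by substituting $t_j = a_j$ in $x_c(t)$. Let
$y_1, \dots, y_N$ be indeterminates. It is shown in [12, §3] that $\sum_{j=1}^{N} y_j e_{\beta_j} \in \mathfrak{c}_{\mathfrak{u}}(x_c(a))$ if and only
if $(y_1, \dots, y_N)$ is a solution of a certain system of linear equations

\[ \sum_{k=1}^{N} P_{jk}(a)\, y_k = 0, \]

where $P_{jk}(t) \in \mathbb{Z}[t]$ are linear polynomials determined by the Chevalley commutator
relations. To find out for which $a$ we have $\dim \mathfrak{c}_{\mathfrak{u}}(x_c(a) + \mathfrak{m}_i) < \dim \mathfrak{c}_{\mathfrak{u}}(x_c(a) + \mathfrak{m}_{i-1})$, so
that Lemma 2.1 can be applied, one has to check for which $a$ the rank of the matrix
$(P_{jk}(a))_{j,k} \in \mathrm{Mat}_{(i-1) \times N}(\mathbb{Z})$ increases when one appends the $i$-th row.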
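
The rank criterion can be tried out numerically in type $A_{n-1}$, where $\mathfrak{u}$ consists of the strictly upper triangular matrices and the matrix of the linear system follows from $[e_{ij}, e_{st}] = \delta_{js} e_{it} - \delta_{ti} e_{sj}$. The following Python sketch is an illustration only and not the GAP program: it works with numerical coefficients in a prime field $\mathbb{F}_p$ rather than with polynomials, orders the positive roots by height (one admissible enumeration), and all function names are ours. It declares $i$ inert exactly when appending the $i$-th row raises the rank, and counts the elements with $a_i = 0$ at every inert point.

```python
from itertools import product

def positive_roots(n):
    """Positive roots (i, j), i < j, of type A_{n-1}, ordered by height j - i."""
    roots = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sorted(roots, key=lambda r: (r[1] - r[0], r[0]))

def rank_mod_p(rows, p):
    """Rank of a matrix over F_p (p prime) by Gaussian elimination."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(v - f * w) % p for v, w in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def commutator_matrix(a, roots, p):
    """Entry (j, k): coefficient of e_{roots[j]} in [e_{roots[k]}, x], x = sum a_m e_{roots[m]}."""
    index = {r: k for k, r in enumerate(roots)}
    size = len(roots)
    mat = [[0] * size for _ in range(size)]
    for k, (i, j) in enumerate(roots):
        for m, (s, t) in enumerate(roots):
            if j == s:   # e_ij e_st = e_it
                mat[index[(i, t)]][k] = (mat[index[(i, t)]][k] + a[m]) % p
            if t == i:   # -e_st e_ij = -e_sj
                mat[index[(s, j)]][k] = (mat[index[(s, j)]][k] - a[m]) % p
    return mat

def is_minimal(a, roots, p):
    """True if a_i = 0 at every inert point i (rank jump when row i is appended)."""
    mat = commutator_matrix(a, roots, p)
    previous = 0
    for i in range(len(roots)):
        current = rank_mod_p(mat[: i + 1], p)
        if current > previous and a[i] % p:
            return False
        previous = current
    return True

def count_minimal(n, p):
    roots = positive_roots(n)
    return sum(is_minimal(a, roots, p) for a in product(range(p), repeat=len(roots)))

print(count_minimal(3, 2), count_minimal(4, 2))
```

For $n = 3$ only the last root can be inert, and it is inert exactly when $(a_1, a_2) \ne (0, 0)$, which gives $(q^2 - 1) + q = q^2 + q - 1$ minimal representatives.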

Now we give the data that the algorithm is holding at any point during a run. Note that
the first three data elements $c$, $A$ and $B$ uniquely determine a family $X_c^\iota$ which we also
denote by $X_{c,A,B}$ as above. We give some explanation of the meaning of the data here, but
parts can only be fully understood once we have described the algorithm. We use quotation
marks to identify terminology that has not been explained.

• The tuple $c \in \{I, R_0, R_n\}^i$ which determines $x_c(t) \in \tilde{\mathfrak{u}}$.

• The set $A$ of polynomials in $\mathbb{Z}[t]$ which vanish on $X_c^\iota$.

• The tuple $B$ of polynomials in $\mathbb{Z}[t]$ which have no roots in $X_c^\iota$.

• For each $f \in A \cup B$, we have associated $\sigma(f)$, which is either equal to 0 or an
indeterminate $t_j$ in which $f$ is linear.

• The matrix $Q(t) \in \mathrm{Mat}_{i \times N}(\mathbb{Z}[t])$ which comprises the first $i$ rows of $(P_{jk}(t))_{j,k}$ in
row-reduced form.

• The tuple $\pi$ containing the "pivots" used for the first $i$ row-reductions of $Q(t)$.

• The stack $S := \{(c, A, B, \pi, \sigma, Q(t))\}$, an ordered subset of

\[ \bigcup_{i=1}^{N} \{I, R_0, R_n\}^i \times \mathcal{P}(\mathbb{Z}[t]) \times \bigcup_{i \in \mathbb{N}} \mathbb{Z}[t]^i \times \bigcup_{i=1}^{N} \{0, 1, \dots, N\}^i \times \bigcup_{i \in \mathbb{N}} \{t_1, \dots, t_m\}^i \times \mathrm{Mat}_{i \times N}(\mathbb{Z}[t]) \]

containing information about families that the program has not processed yet.

• The "good-families" set $\Gamma$, which contains for each family already processed enough
data from which one can recover the number of $\mathbb{F}_q$-rational points in the family.

• The "bad output" $O := \{(c, A, B)\}$, a subset of

\[ \{I, R_0, R_n\}^N \times \mathcal{P}(\mathbb{Z}[t]) \times \bigcup_{i \in \mathbb{N}} \mathbb{Z}[t]^i \]

containing sufficient information about each "bad" family.
At the beginning of the program, we have the configuration

• $c := (R_n)$,

• $A := \emptyset$,

• $B := \emptyset$,

• $\sigma := \emptyset$,

• $\pi := (0)$,

• $Q(t) := 0 \in \mathrm{Mat}_{1 \times N}(\mathbb{Z}[t])$,

• $O := \emptyset$,

• $\Gamma := \emptyset$, and

• $S := \{((R_0), \emptyset, \emptyset, (0), \emptyset, 0)\}$.

The main loop in the algorithm is explained as follows. At each point we are considering
a family $X_c^\iota = X_{c,A,B}$ as above. In the explanation below we sometimes speak of relevant $a$,
by which we mean $a$ such that $x_c(a) \in X_{c,A,B}$.

Case 1: If the length $i - 1$ of $c$ is smaller than $N$, then we generate the $i$-th row of the matrix
$(P_{jk}(t))_{j,k}$ and append it to $Q(t)$. Then the following operations are applied to row reduce
the $i$-th row. For all $1 \le j \le i - 1$ with $\pi_j \ne 0$, we modify the $i$-th row $Q_i(t)$ of $Q(t)$ by
setting

\[ Q_i(t) := \frac{Q_{j,\pi_j}(t)}{\gcd(Q_{i,\pi_j}(t), Q_{j,\pi_j}(t))}\, Q_i(t) - \frac{Q_{i,\pi_j}(t)}{\gcd(Q_{i,\pi_j}(t), Q_{j,\pi_j}(t))}\, Q_j(t). \]

Note that this leads to $Q_{i,\pi_j}(t) = 0$ for all $j$ with $\pi_j \ne 0$. Here the SINGULAR [22] interface
for GAP is used to calculate the greatest common divisors.

The next step depends on the set $L_i$ of non-zero polynomials in $Q_i(t)$ which are not
divisible by any polynomial in $A$.

Case 1a: $L_i = \emptyset$: In this case, we have that $i$ is a ramification point of $x_c(a)$ for all relevant
$a$, and we set

• $\pi := (\pi, 0)$,

• $c := (c, R_n)$, and

• $S := S \cup \{((c, R_0), A, B, \pi, \sigma, Q(t))\}$.

Case 1b: There exists $Q_{i,l}(t) \in L_i$ such that $Q_{i,l}$ is a monomial or divides an $f \in B$: In this
case, we have that $i$ is an inert point of $x_c(a)$ for all relevant $a$, and we set

• $\pi := (\pi, l)$, and

• $c := (c, I)$.

Case 1c: $L_i \ne \emptyset$, but no $Q_{i,l}(t)$ as in Case 1b exists: In this case $i$ can be either an inert
point or a ramification point of $x_c(a)$ for relevant $a$. Here we pick a $Q_{i,l}(t) \in L_i$ that is
minimal with respect to a total order on $\mathbb{Z}[t]$, comparing first the number of terms of two
polynomials, then their degrees and finally their leading coefficients. Then we apply some
new subroutines, the polynomial-resolving subroutine and the stack-generating subroutine,
and update the data as specified by these subroutines.
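
The comparison used to pick the polynomial in Case 1c can be mimicked by a sort key. The Python sketch below is only an illustration: the representation of a polynomial as a dictionary from exponent tuples to integer coefficients, the choice of leading term (highest total degree, ties broken lexicographically) and the use of the absolute value of the leading coefficient are our assumptions, not details taken from the program. Further tie-breaking would be needed to obtain a genuine total order.

```python
def poly_key(poly):
    """Sort key (number of terms, total degree, |leading coefficient|).

    `poly` maps exponent tuples to non-zero integer coefficients,
    e.g. {(1, 1, 0): 1, (0, 0, 1): 1} stands for t1*t2 + t3.
    """
    terms = [(exps, c) for exps, c in poly.items() if c != 0]
    if not terms:
        return (0, 0, 0)
    degree = max(sum(exps) for exps, _ in terms)
    # Assumed leading term: highest total degree, then lexicographically largest.
    _, lead_coeff = max(terms, key=lambda term: (sum(term[0]), term[0]))
    return (len(terms), degree, abs(lead_coeff))

# Hypothetical candidates for the set L_i.
candidates = {
    "t1*t2 + t3": {(1, 1, 0): 1, (0, 0, 1): 1},
    "t1^2 - t2": {(2, 0, 0): 1, (0, 1, 0): -1},
    "3*t1 + t2": {(1, 0, 0): 3, (0, 1, 0): 1},
}
best = min(candidates, key=lambda name: poly_key(candidates[name]))
print(best)
```

All three candidates have two terms, so the decision falls to the degree, and the linear polynomial is selected even though its leading coefficient is larger.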

These subroutines are a substantial improvement to the algorithm from [12]. Their aim is
to determine the triples $(c^\iota, A^\iota, B^\iota)$ for $\iota = 1, \dots, k_c$ mentioned above such that (3.1) holds.

The algorithm aims to construct the sets $A^\iota$ and $B^\iota$ in some way so that each polynomial
in $C^\iota := (A^\iota \setminus A) \cup (B^\iota \setminus B)$ is linear in one of the indeterminates $t_j$. Often the elements
of $C^\iota$ are irreducible factors of $Q_{i,l}(t)$. However, the situation can get considerably more
complicated: when a polynomial $f = h_1 t_k + h_2$ is linear in the indeterminate $t_k$, where
$h_1, h_2 \in \mathbb{Z}[t_1, \dots, t_m]$ are polynomials not involving $t_k$, then it is also necessary to consider
when the polynomials $h_1$ and $h_2$ give zero values. The SINGULAR [22] interface for GAP is
also used in these processes.

The variable $\sigma$ is used to record which indeterminate a polynomial in $A$ or $B$ is linear in.
So if we have found that $f = h_1 t_k + h_2$ is linear in $t_k$, then we set $\sigma(f) = t_k$. If $f$ is not linear
in any indeterminate, then we set $\sigma(f) = 0$. If there is more than one such $t_k$, the program
chooses the $t_k$ which is most "suitable" for subsequent calculations.

Often when we have a polynomial $f \in A$ which is linear in an indeterminate we perform a
substitution to reduce the number of indeterminates. This is done when $t_k$ appears linearly
in some $f = h_2 t_k - h_1 \in A$. Then we substitute $h_1/h_2$ for $t_k$ in $Q(t)$ as well as in all other
elements of $A \cup B$.

A "trick" that the program sometimes uses in the polynomial-resolving subroutine is to
make a linear change of coordinates in the indeterminates so that a polynomial becomes
linear in an indeterminate. For example, there might be a polynomial of the form
$f(t) = (t_1 + t_2)^2 + t_1 \in A$. Since neither $t_1$ nor $t_2$ appears linearly in $f(t)$, we cannot solve for either
of them. By introducing a new variable $z_2 := t_1 + t_2$ and replacing $f(t)$ by $z_2^2 + t_1$, we are
able to solve for $t_1$ and then make a substitution. Implementing routines to look for such
substitutions was a huge challenge, and it also involved updating other parts of the data
accordingly.
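
The effect of such a change of coordinates can be checked on the example $f(t) = (t_1 + t_2)^2 + t_1$ over a prime field. The Python sketch below is an illustration with our own function names and is not part of the program: it counts the zeros of $f$ with $t_1, t_2 \ne 0$ once directly and once after putting $z = t_1 + t_2$, where the zero set is parametrized by $t_1 = -z^2$, $t_2 = z + z^2$.

```python
def count_zeros_direct(p):
    """Number of (t1, t2) in (F_p^x)^2 with (t1 + t2)^2 + t1 = 0."""
    return sum(
        1
        for t1 in range(1, p)
        for t2 in range(1, p)
        if ((t1 + t2) ** 2 + t1) % p == 0
    )

def count_zeros_substituted(p):
    """Same count via z = t1 + t2: t1 = -z^2 and t2 = z + z^2 must both be non-zero."""
    return sum(
        1
        for z in range(p)
        if (-z * z) % p != 0 and (z + z * z) % p != 0
    )

for p in (2, 3, 5, 7, 11):
    print(p, count_zeros_direct(p), count_zeros_substituted(p))
```

After the substitution the two conditions read $z \ne 0$ and $z \ne -1$, so both counts equal $p - 2$.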

When these two subroutines are complete the algorithm has calculated the 4-tuples
$(c^\iota, A^\iota, B^\iota, \sigma^\iota)$. Then the data is updated as follows.

• $\pi := (\pi, l)$,

• $c := c^1 = (c, I)$,

• $A := A^1 = A$,

• $B := B^1$,

• $\sigma := \sigma^1$, and

• $S := S \cup \{(c^\iota, A^\iota, B^\iota, \pi, \sigma^\iota, Q(t)) \mid \iota = 2, \dots, k_c\}$.

Case 2: If $c$ has length $N$, then we have determined the family $X_{c,A,B}$. The program now
applies a subroutine on $(c, A, B, \sigma)$ to attempt to calculate the number of $\mathbb{F}_q$-rational points
of the family $X_{c,A,B}$. This subroutine is called the nice conditions subroutine; it constitutes
a major improvement to the algorithm from [12].

The first step in the nice conditions subroutine involves checking whether each polynomial
$f \in A \cup B$ is linear in some $t_j$, i.e. $f = h_1 t_j + h_2$, where $h_1, h_2 \in K[t_1, \dots, t_{m_c}]$ do not involve
$t_j$. This is first done using $\sigma$, but further checks are made to see if elements in $A \cup B$ have
become linear in an indeterminate as a consequence of substitutions being made at some
point during the run. Then the values $a$ for which $f(a) = 0$ are given by $a_j = -h_2(a)/h_1(a)$, so
it is relatively straightforward to count the number of $a$ for which $f(a)$ is zero or non-zero.
If this can be applied to all the $f \in A \cup B$, then $|X_c(q)|$ can be calculated. However, a
great deal of care needs to be taken here as there are a variety of potential complications, for
example $h_1(a)$ or $h_2(a)$ could be zero. Also there will be a number of dependencies between
the conditions calculated for different $f \in A \cup B$, so the algorithm is required to make many
checks before it is able to complete the calculation. If the subroutine is successful, then it
outputs $d = m_c - |A|$ and a tuple $n(c, \sigma)$ which contains information about the number of
indeterminates that can take $q - 1 - j$ values, for each possible $j$. The variables $d$ and $n(c, \sigma)$
are later used to calculate $|X_{c,A,B}(q)|$ as explained below.
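
A toy instance shows how a condition that is linear in one indeterminate turns into a count of $q - 1 - j$ admissible values. The Python sketch below is an illustration over a prime field and not the nice conditions subroutine itself; the family with three non-zero indeterminates and the polynomial $f = t_1 t_3 + t_2$ are hypothetical. Since $t_1, t_2 \ne 0$, the excluded value $t_3 = -t_2/t_1$ is itself non-zero, so requiring $f \ne 0$ leaves $q - 2$ values for $t_3$, while requiring $f = 0$ fixes $t_3$.

```python
from itertools import product

def count_points(p, m, A, B):
    """Brute-force count of a in (F_p^x)^m with f(a) = 0 for f in A and g(a) != 0 for g in B."""
    return sum(
        1
        for a in product(range(1, p), repeat=m)
        if all(f(a) % p == 0 for f in A) and all(g(a) % p != 0 for g in B)
    )

def f(a):
    """Hypothetical condition f = t1*t3 + t2, linear in t3."""
    return a[0] * a[2] + a[1]

for p in (3, 5, 7):
    nonvanishing = count_points(p, 3, [], [f])   # f placed in B
    vanishing = count_points(p, 3, [f], [])      # f placed in A
    print(p, nonvanishing, (p - 1) ** 2 * (p - 2), vanishing, (p - 1) ** 2)
```

With $f \in A$ the count is $(q - 1)^2$, in line with $d = m_c - |A| = 2$.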

The algorithm updates the data depending on the outcome of the nice conditions subroutine.

(1) If the subroutine fails to calculate the number of $\mathbb{F}_q$-rational points in $X_{c,A,B}$, then
it sets $O := O \cup \{(c, A, B)\}$. We name such families bad families, as in this case this
output would need to be analyzed by hand to determine $|X_{c,A,B}(q)|$. For the cases
where we have run the algorithm, we end up with $O = \emptyset$, which is a significant
improvement on the algorithm from [12].
(2) If the subroutine was successful in calculating $|X_{c,A,B}(q)|$, then we call the family
a good family. In this case $\Gamma$ is updated by setting $\Gamma := \Gamma \cup \{(d, n(c, \sigma))\}$, where
$d = m_c - |A|$ and $n(c, \sigma)$ are as above.

The algorithm proceeds by updating the data from the stack as follows. If $S = \emptyset$, then
the main loop terminates. Otherwise we update the variables:

• $(c, A, B, \pi, \sigma, Q(t)) := \mathrm{top}(S)$,

• $S := S \setminus \{\mathrm{top}(S)\}$,

where $\mathrm{top}(S)$ is the last element in the ordered set $S$.

After the end of the main loop, the data from $\Gamma$ is used to calculate a polynomial $Z(q)$.
This polynomial is the sum of $|X_{c,A,B}(q)|$, where $(c, A, B)$ runs over all families of minimal
representatives for which the program calculated $|X_{c,A,B}(q)|$. When $(c, A, B)$ corresponds to
the element $(d, n(c, \sigma)) \in \Gamma$, we have



\X CiA , b (q)\ = (? - TT(g - 1 



\n{c,a)j 



We note that it turns out that in most cases $A = B = \emptyset$, so the cardinality is given by
$|X_{c,A,B}(q)| = (q - 1)^{m_c}$; more complicated situations occur rarely when $G$ has small rank,
but with increasing frequency for higher ranks. If $O = \emptyset$, then $Z(q) = k(U(q))$. Otherwise
one would have to calculate the number of $\mathbb{F}_q$-rational points of the families in $O$ by
hand. As already mentioned, such hand calculations are not required for the cases on which
we have run the algorithm.

We remark that the usage of $\mathfrak{g}_{\mathbb{Z}}$ makes it necessary to be careful about implicit divisions
occurring during the calculations. Therefore, the program records the primes that occur in 
the numerator or denominator of any coefficient of a polynomial from A or B. These primes, 
for which our results may not be valid, are then returned at the end. Fortunately, for all 
calculations that we have completed, these primes were bad primes. 

Finally we mention that the algorithm attempts to normalize coefficients when possible
to reduce the number of indeterminates required. The maximal torus $T$ acts on the sets of
minimal representatives and can be used to normalize certain coefficients of $x_c(t)$ to be equal
to 1 as explained in [12, §3]. Let $c \in \{I, R_0, R_n\}^N$ and let $J$ be a linearly independent subset
of $\{\beta_{c,1}, \dots, \beta_{c,m_c}\}$. Then we can find for any $a \in (K^\times)^{m_c}$ an element $b \in (K^\times)^{m_c}$ such that
$b_j = 1$ if $\beta_{c,j} \in J$ and $x_c(a) = t \cdot x_c(b)$ for some $t \in T$. This implies that for $i \in \{1, \dots, N\}$,
we have that $i$ is an inert point of $x_c(a)$ if and only if it is an inert point of $x_c(b)$. This trick
is useful since it reduces the number of indeterminates which arise in the program, though
we have to take care here: it may be the case that $x_c(a)$ and $x_c(b)$ are not conjugate by an
element of $T(q)$. The algorithm has a routine to normalize coefficients to be equal to 1 as
above, as long as this is possible by elements in $U(q)$.

We also remark here that for $G(q)$ with root system of type $B_7$ and $C_7$, it turned out
that there are situations where, surprisingly, the normalization of certain coefficients leads
to more complicated polynomials in the sets $A$ and $B$. This meant that we sometimes had
to manually override some normalizations.




4. The results 



Table 1 contains the values of $k(U(q))$ for $G(q)$ simple of rank at most 7, except $E_7$, written
as polynomials in $v := q - 1$. The polynomials up to rank 5 were already calculated in [12],
while the polynomials for $G(q)$ of type $A_r$, $r \le 12$, were given in [24]. The newly obtained
polynomials are colored red within the tables.



G(q)        k(U(q))

A_1         v + 1

A_2         v^2 + 3v + 1
B_2         2v^2 + 4v + 1
G_2         v^3 + 5v^2 + 6v + 1

A_3         2v^3 + 7v^2 + 6v + 1
B_3, C_3    v^4 + 8v^3 + 16v^2 + 9v + 1

A_4         5v^4 + 20v^3 + 25v^2 + 10v + 1
B_4, C_4    v^6 + 11v^5 + 48v^4 + 88v^3 + 64v^2 + 16v + 1
D_4         2v^5 + 15v^4 + 36v^3 + 34v^2 + 12v + 1
F_4         v^8 + 9v^7 + 40v^6 + 124v^5 + 256v^4 + 288v^3 + 140v^2 + 24v + 1

A_5         v^6 + 18v^5 + 70v^4 + 105v^3 + 65v^2 + 15v + 1
B_5, C_5    2v^8 + 24v^7 + 132v^6 + 395v^5 + 630v^4 + 500v^3 + 180v^2 + 25v + 1
D_5         2v^7 + 22v^6 + 106v^5 + 235v^4 + 240v^3 + 110v^2 + 20v + 1

A_6         8v^7 + 84v^6 + 301v^5 + 490v^4 + 385v^3 + 140v^2 + 21v + 1
B_6, C_6    v^{11} + 15v^{10} + 112v^9 + 547v^8 + 1845v^7 + 4121v^6 + 5701v^5 + 4560v^4 + 1960v^3
            + 410v^2 + 36v + 1
D_6         v^{10} + 13v^9 + 87v^8 + 393v^7 + 1157v^6 + 2032v^5 + 2005v^4 + 1060v^3 + 275v^2
            + 30v + 1
E_6         nil 1 1 o„,10 1 7K„,9 1 qco,,8 j_ i qq«,„7 _i_ 01 70,,, 6 1 a 77^,5 1 ac\qk„A , 1 qaa,,,3
            + 390v^2 + 36v + 1

A_7         4v^9 + 74v^8 + 496v^7 + 1568v^6 + 2604v^5 + 2345v^4 + 1120v^3 + 266v^2 + 28v + 1
B_7, C_7    v^{14} + 18v^{13} + 158v^{12} + 899v^{11} + 374(^10 + hqs5v 9 + 29328v^8 + 52055v^7
            + 62930v^6 + 48797v^5 + 22855v^4 + 6020v^3 + 812v^2 + 49v + 1
D_7         4v^{12} + 59v^{11} + 417v^{10} + 1913v^9 + 6256v^8 + 14289v^7 + 21497v^6 + 20188v^5
            + 11305v^4 + 3570v^3 + 581v^2 + 42v + 1

Table 1. k(U(q)) as polynomials in v = q - 1.
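
To pass between the two normalizations, a polynomial in $v$ can be expanded in powers of $q$ with the binomial theorem. The short Python sketch below (our own helper names, not part of the program) does this for coefficient lists with the constant term first and recovers the familiar forms $q^2 + q - 1$, $2q^3 + q^2 - 2q$ and $5q^4 - 5q^2 + 1$ for types $A_2$, $A_3$ and $A_4$.

```python
from math import comb

def v_to_q(coeffs_v):
    """Coefficients (constant term first) of P(q - 1) in powers of q, given those of P(v)."""
    out = [0] * len(coeffs_v)
    for k, c in enumerate(coeffs_v):
        # (q - 1)^k = sum_j C(k, j) (-1)^(k - j) q^j
        for j in range(k + 1):
            out[j] += c * comb(k, j) * (-1) ** (k - j)
    return out

def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list (constant term first)."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

print(v_to_q([1, 3, 1]))          # A_2: v^2 + 3v + 1
print(v_to_q([1, 6, 7, 2]))       # A_3: 2v^3 + 7v^2 + 6v + 1
print(v_to_q([1, 10, 25, 20, 5])) # A_4: 5v^4 + 20v^3 + 25v^2 + 10v + 1
```

Evaluating the $v$-form at $v = 1$ gives the value at $q = 2$; the coefficients in $q$ are no longer all non-negative, which is why the tables are stated in $v$.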

The most noteworthy observation is that the polynomials for $B_r$ and $C_r$ coincide for fixed
$r$. This has already been noticed for $r \le 5$, and the equality still holds for $r = 6, 7$.

Another phenomenon that was already observed in [24] for type A r and in [12] for rank 
at most 5 is that the coefficients of k(U(q)) written as polynomials in v are non-negative 
integers. A heuristic idea why this may be the case was given in loc. cit. 

It is noteworthy that the constant coefficient of all calculated polynomials is equal to one. 
We explain this and prove explicit formulas for the coefficients of degrees one and two in the 
next section. 

Eamonn O'Brien used the p-group conjugacy algorithms available in MAGMA [3] to confirm 
the values for k(U(p)) in each of the cases in Table 1 for p the smallest good prime for G. 

We also used a modification of our program to calculate the number $k(U(q), U^{(l)}(q))$ of
$U(q)$-conjugacy classes in the $l$-th term of the descending central series of $U(q)$, for certain
groups and $l \in \mathbb{N}$. Here we also see that these numbers are given by polynomials in $v$ with
non-negative integer coefficients. The results of these calculations are presented in Table 2.



G(q)    l     k(U(q), U^{(l)}(q))

E_7     2     v^{14} + Uv 13 + 92v^{12} + 380v^{11} + 1128v^{10} + 2675v^9 + 5694v^8 + 11565v^7
              + 19486v^6 + 21745v^5 + 13976v^4 + 4724v^3 + 755v^2 + 50v + 1
        3     3v^{10} + 37v^9 + 253v^8 + 1193v^7 + 3767v^6 + 6724v^5 + 6194v^4 + 2798v^3 + 560v^2
              + 44v + 1
        4     v^9 + 13v^8 + 94v^7 + 512v^6 + 1600v^5 + 2312v^4 + 1499v^3 + 395v^2 + 38v + 1

E_8     7     2v^{13} + 28v^{12} + 188v^{11} + 822v^{10} + 2838v^9 + 8987v^8 + 25419v^7 + 51513v^6
              + 60889v^5 + 37867v^4 + 11140v^3 + 1428v^2 + 70v + 1
        8     v^{12} + Uv 11 + 94v^{10} + U9v 9 + 1830v^8 + 6381v^7 + 16610v^6 + 25867v^5
              + 20935v^4 + 7620v^3 + 1155v^2 + 64v + 1
        9     v^{10} + 2k, 9 + 199v^8 + 1125v^7 + 4228v^6 + 9382v^5 + 10568v^4 + 4955v^3 + 912v^2
              + 58v + 1
        10    v^9 + 17v^8 + 135v^7 + 719v^6 + 2568v^5 + 4652v^4 + 3014v^3 + 699v^2 + 52v + 1

Table 2. k(U(q), U^{(l)}(q)) for E_7 and E_8 as polynomials in v = q - 1.



5. The coefficients of k(U(q)) of small degree 

Assuming that k(U(q)) is a polynomial in v, we now prove that the coefficients of k(U(q)) 
of degrees zero, one and two can be easily determined based on properties of the root system. 
We start with an elementary lemma; since we were unable to find a proof in the literature, 
we give a complete argument. 

Lemma 5.1. Let $\Phi$ be an irreducible root system of rank $r \ge 3$. Let $\alpha$, $\beta$ and $\gamma$ be three
pairwise distinct linearly dependent positive roots with $\mathrm{ht}\,\gamma > \max\{\mathrm{ht}\,\alpha, \mathrm{ht}\,\beta\}$. Then at least
one of the following statements is true:

• $\beta - \alpha \in \Phi$,

• $\gamma - \beta \in \Phi$ and $\gamma - \alpha \in \Phi$,

• $\gamma - \alpha \in \Phi$, but $\beta + \gamma - \alpha \notin \Phi$, or

• $\gamma - \beta \in \Phi$, but $\alpha + \gamma - \beta \notin \Phi$.

Proof. Choose an embedding of $\Phi$ into the real vector space $V$. It is a well-known fact (e.g.
[17, Ex. 9.7]) that, if $V' \subseteq V$ is the subspace spanned by $\alpha$ and $\beta$, then $\Phi \cap V' =: \Phi'$ is a
root system of rank two. Since $r \ge 3$, we know that $\Phi'$ is a proper root subsystem of $\Phi$.
Because two distinct positive roots are linearly independent, $\gamma$ can be written as a linear
combination of $\alpha$ and $\beta$, and thus $\gamma$ lies in $\Phi'$. Let $\Phi'^+ := \Phi' \cap \Phi^+$, then the three roots are
also positive in $\Phi'$ and $\operatorname{ht} \gamma \ge \max\{\operatorname{ht} \alpha, \operatorname{ht} \beta\}$ stays true in $\Phi'$.

There are three types of root systems of rank two which might appear as proper root
subsystems in an irreducible root system ($G_2$ never does). We denote by $\delta$ and $\varepsilon$ the simple
roots of $\Phi'$ and proceed by case-by-case analysis.

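$\Phi'$ of type $A_1 \times A_1$: This is not possible, since $\alpha$, $\beta$ and $\gamma$ are pairwise distinct and
$\Phi'^+$ contains only two roots.

$\Phi'$ of type $A_2$: We have a bijection between $\{\alpha, \beta, \gamma\}$ and $\Phi'^+ = \{\delta, \varepsilon, \delta + \varepsilon\}$. Then $\gamma = \delta + \varepsilon$
due to the height condition, and then the second statement is true.

$\Phi'$ of type $B_2$: Here $\Phi'^+ = \{\delta, \varepsilon, \delta + \varepsilon, \delta + 2\varepsilon\}$. If $\gamma = \delta + \varepsilon$, then the second statement
is true again. The remaining case is when $\gamma = \delta + 2\varepsilon$. If either $\alpha$ or $\beta$ is $\delta + \varepsilon$, then the
other one is a simple root and the first statement holds. So suppose that $\{\alpha, \beta\} = \{\delta, \varepsilon\}$.
If $\alpha = \varepsilon$, then the third statement holds; if $\alpha = \delta$, then the fourth statement holds. □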
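The case analysis in this proof can be sanity-checked by brute force in a small example. The following is only a minimal sketch, assuming the standard realisation of type $B_3$ by the vectors $e_i$ and $e_i \pm e_j$ in $\mathbb{Z}^3$ with simple roots $e_1 - e_2$, $e_2 - e_3$, $e_3$; the function names are hypothetical and are not part of the program described in this paper. It tests the four statements of Lemma 5.1 on every linearly dependent triple of positive roots.

```python
from itertools import combinations

def b3_positive_roots():
    """Positive roots of B_3 realised as e_i and e_i +/- e_j (i < j) in Z^3."""
    roots = []
    for i in range(3):
        short = [0, 0, 0]
        short[i] = 1
        roots.append(tuple(short))
        for j in range(i + 1, 3):
            for sign in (1, -1):
                long_root = [0, 0, 0]
                long_root[i], long_root[j] = 1, sign
                roots.append(tuple(long_root))
    return roots

def height(r):
    # With simple roots e1-e2, e2-e3, e3 the simple-root coefficients of r
    # are r1, r1+r2, r1+r2+r3; the height is their sum.
    return 3 * r[0] + 2 * r[1] + r[2]

def det3(a, b, c):
    """Determinant of the 3 x 3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

POS = b3_positive_roots()
PHI = set(POS) | {tuple(-x for x in r) for r in POS}

def some_statement_holds(alpha, beta, gamma):
    """True if at least one of the four statements of Lemma 5.1 holds."""
    s1 = sub(beta, alpha) in PHI
    s2 = sub(gamma, beta) in PHI and sub(gamma, alpha) in PHI
    s3 = sub(gamma, alpha) in PHI and add(beta, sub(gamma, alpha)) not in PHI
    s4 = sub(gamma, beta) in PHI and add(alpha, sub(gamma, beta)) not in PHI
    return s1 or s2 or s3 or s4

def check_lemma():
    """Return (all triples pass, number of dependent triples checked)."""
    checked = 0
    for triple in combinations(POS, 3):
        if det3(*triple) != 0:
            continue  # linearly independent triple, lemma does not apply
        top = max(height(r) for r in triple)
        for gamma in triple:
            if height(gamma) < top:
                continue
            alpha, beta = [r for r in triple if r != gamma]
            if not some_statement_holds(alpha, beta, gamma):
                return False, checked
            checked += 1
    return True, checked

if __name__ == "__main__":
    print(check_lemma())
```

Such a check covers a single root system only and does not replace the proof, but it guards against misreading the four statements.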

Lemma 5.2. Let $I$ be a subset of $\Phi^+$, define the span




and let $\beta_1, \dots, \beta_k$ be linearly independent roots in $I$. Let $\hat\beta_j$ be the coordinate vector of $\beta_j$
with respect to the base of $\Phi$ determined by $\Phi^+$, for $1 \le j \le k$. Let $d_1, \dots, d_k$ be the diagonal
entries of the Smith normal form of the matrix $(\hat\beta_1, \dots, \hat\beta_k) \in \operatorname{Mat}_{r \times k}(\mathbb{Z})$.

Then for each $x \in X_I(q)$, the size of the orbit $T(q) \cdot x$ is divisible by $v^k/d$, where $v := q - 1$
and $d := \prod_{i=1}^{k} \gcd\{d_i, v\}$.

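Proof. Since $\beta_1, \dots, \beta_k$ are linearly independent, they form a basis for the sub-lattice
$L := \mathbb{Z}\beta_1 + \cdots + \mathbb{Z}\beta_k \subseteq \mathbb{Z}\Phi$. The theory of finitely generated abelian groups makes it possible to
find a basis $\chi_1, \dots, \chi_r$ of the character group $X(T)$ such that $d_1\chi_1, \dots, d_k\chi_k$ is a basis for $L$.
Write $\beta_j = \sum_{i=1}^{k} c_{ij} d_i \chi_i$. Then the matrix $(c_{ij})_{i,j} \in \operatorname{Mat}_{k \times k}(\mathbb{Z})$ is invertible and we denote by
$(a_{jh})_{j,h} \in \operatorname{Mat}_{k \times k}(\mathbb{Q})$ its inverse. Let $\psi_1, \dots, \psi_r \in X^\vee(T)$ be dual to $\chi_1, \dots, \chi_r$ with respect
to the perfect pairing $\langle \cdot, \cdot \rangle$ on $X(T) \times X^\vee(T)$, i.e. $\langle \psi_i, \chi_j \rangle = \delta_{ij}$.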

It is known that $\beta_j \circ \psi_i(b) = b^{\langle \psi_i, \beta_j \rangle}$ and $t \cdot e_\beta = \beta(t) e_\beta$, for $b \in K^\times$ and $t \in T$. By looking
at the coefficient of $t \cdot x$ belonging to $e_{\beta_j}$, we get that $t = \prod_{l=1}^{r} \psi_l(b_l) \in C_T(x)$ satisfies

\[ \prod_{l=1}^{k} b_l^{c_{lj} d_l} = 1 \quad \text{for all } 1 \le j \le k. \]

Moreover, it follows from

\[ \prod_{j=1}^{k} \Bigl( \prod_{l=1}^{k} b_l^{c_{lj} d_l} \Bigr)^{a_{jh}} = b_h^{d_h} \]

that

\[ C_T(x) \subseteq S := \Bigl\{ \prod_{l=1}^{r} \psi_l(b_l) \Bigm| b_l \in K^\times \text{ for all } 1 \le l \le r,\ b_l^{d_l} = 1 \text{ for all } 1 \le l \le k \Bigr\}. \]

If we take $x \in X_I(q)$ and consider the action of $T(q)$ on $X_I(q)$, then $C_{T(q)}(x) \subseteq S(q)$. Because
of the condition on the $b_j$ to be $d_j$-th roots of unity and to lie in $\mathbb{F}_q$ (i.e. to be $v$-th roots of
unity as well), the order of $S(q)$ is $v^{r-k} d$, thus the order of $C_{T(q)}(x)$ divides this number. By
the orbit stabilizer theorem, $T(q) \cdot x$ has size $v^r / |C_{T(q)}(x)|$, which is divisible by $v^k/d$. □
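To illustrate how the number $d$ of Lemma 5.2 is obtained from the roots, here is a minimal sketch in Python. It does not compute a Smith normal form by row and column operations; instead it uses the standard fact that the product $d_1 \cdots d_i$ of the first $i$ diagonal entries equals the gcd of all $i \times i$ minors. The function names `smith_diagonal` and `d_value` are hypothetical, and the sample roots are $\alpha_5$ and $2\alpha_3 + 2\alpha_4 + \alpha_5$ in type $C_5$, written as coordinate vectors with respect to the simple roots.

```python
from itertools import combinations
from math import gcd

def det(m):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def smith_diagonal(columns):
    """Diagonal entries d_1, ..., d_k of the Smith normal form of the r x k
    integer matrix with the given columns (assumed linearly independent)."""
    r, k = len(columns[0]), len(columns)
    rows = [[col[i] for col in columns] for i in range(r)]
    divisors = [1]  # divisors[i] = gcd of all i x i minors
    for size in range(1, k + 1):
        g = 0
        for rs in combinations(range(r), size):
            for cs in combinations(range(k), size):
                g = gcd(g, det([[rows[i][j] for j in cs] for i in rs]))
        divisors.append(g)
    return [divisors[i] // divisors[i - 1] for i in range(1, k + 1)]

def d_value(columns, q):
    """d = prod_i gcd(d_i, v) with v = q - 1, as in Lemma 5.2."""
    v = q - 1
    d = 1
    for d_i in smith_diagonal(columns):
        d *= gcd(d_i, v)
    return d

if __name__ == "__main__":
    cols = [(0, 0, 0, 0, 1), (0, 0, 2, 2, 1)]
    print(smith_diagonal(cols))  # [1, 2]
    print(d_value(cols, 5))      # 2
```

For these two roots and $q = 5$ the lemma therefore guarantees that every $T(q)$-orbit in question has size divisible by $v^2/2 = 8$.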

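Lemma 5.3. Let $\beta_1, \dots, \beta_k$ be linearly independent roots in $\Phi^+$. Then:

(1) $T$ acts transitively on $X := \{\sum_{j=1}^{k} a_j e_{\beta_j} \mid a_j \in K^\times\}$, and

(2) $\dim \mathfrak{c}_{\mathfrak{u}}(x + \mathfrak{m}_i) = \dim \mathfrak{c}_{\mathfrak{u}}(y + \mathfrak{m}_i)$ for all $x, y \in X$, $1 \le i \le N$.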

Proof. (1): We define $\psi_1, \dots, \psi_r \in X^\vee(T)$ and $d_1, \dots, d_k$ as in Lemma 5.2. For
$x := \sum_{j=1}^{k} a_j e_{\beta_j} \in X$ and $t = \prod_{l=1}^{r} \psi_l(b_l) \in T$ it follows that $t \cdot x = \sum_{j=1}^{k} a_j \beta_j(t) e_{\beta_j}$. By taking $b_j$
to be a $d_j$-th root of $a_j^{-1}$ for all $j$, we see that $x$ lies in the same $T$-orbit as $\sum_{j=1}^{k} e_{\beta_j}$.
By transitivity, all $x \in X$ are $T$-conjugate.

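(2): Let $x$ and $y = t \cdot x$ be in $X$. Then also $y + \mathfrak{m}_i = t \cdot (x + \mathfrak{m}_i)$ for all $1 \le i \le N$. We get

\begin{align*}
t\, C_U(x + \mathfrak{m}_i)\, t^{-1} &= \{\tilde{u} := t u t^{-1} \in U \mid u \cdot (x + \mathfrak{m}_i) = x + \mathfrak{m}_i\} \\
&= \{\tilde{u} \in U \mid \tilde{u} t \cdot (x + \mathfrak{m}_i) = t \cdot (x + \mathfrak{m}_i)\} \\
&= \{\tilde{u} \in U \mid \tilde{u} \cdot (y + \mathfrak{m}_i) = y + \mathfrak{m}_i\} \\
&= C_U(y + \mathfrak{m}_i),
\end{align*}

so $\dim C_U(x + \mathfrak{m}_i) = \dim C_U(y + \mathfrak{m}_i)$. Now (2) follows from [8, Cor. 4.3]. □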

Theorem 5.4. If $k(U(q))$ is given by a polynomial in $v := q - 1$, then the following statements
hold:

(1) The coefficient of degree zero equals $1$.

(2) The coefficient of degree one equals $|\Phi^+|$.

(3) The coefficient of degree two equals $|\{(\beta_j, \beta_k) \in \Phi^+ \times \Phi^+ \mid j < k,\ \beta_k - \beta_j \notin \Phi\}|$.

Proof. (1): We want to prove that $k(U(q)) - 1$ is divisible by $v/d$ for some fixed $d \in \mathbb{N}$.
First, we note that there is exactly one family of minimal representatives with all coefficients
being zero, namely $\{0\} \subseteq \mathfrak{u}$. Now, if $X_c$ is a different set of minimal representatives, then
there is at least one non-zero coefficient (i.e. $m_c > 0$). Lemma 5.2 with $k = 1$ and $\beta_1 = \beta_c$
yields that there is a $d_{c,v} \in \mathbb{N}$ such that the size of each $T(q)$-orbit on $X_c(q)$ (and thus the
cardinality of $X_c(q)$) is divisible by $v/d_{c,v}$. Take

\[ d_v := \operatorname{lcm}\{d_{c,v} \mid X_c \text{ set of minimal representatives with } m_c > 0\}. \]

Then $v/d_v$ divides $|X_c(q)|$ for all $X_c$ with $m_c > 0$. Thus $v/d_v$ divides $k(U(q)) - 1$. Due to
the definition of the $d_{c,v}$ in Lemma 5.2, there is a $d \in \mathbb{N}$ such that $d_v$ divides $d$ for all $v \in \mathbb{N}$.
Since $k(U(q))$ is a polynomial in $v$, it follows that $v/d$ divides $k(U(q)) - 1$ as a polynomial.
Note that for $k = 1$ one can give a simpler argument because $d_{c,v} = 1$ for all $c$, $v$. However,
the argument above is needed for (2) and (3).
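The passage from divisibility for each admissible value of $v$ to a statement about the constant coefficient can be spelled out as follows. This is only a sketch: it reads "$v/d$ divides $m$" as "$dm/v$ is an integer", assumes integer coefficients, and the symbols $c_i$ and $n$ are notation introduced here for illustration.

```latex
\[
  k(U(q)) = \sum_{i=0}^{n} c_i v^i
  \quad\Longrightarrow\quad
  \frac{d\,\bigl(k(U(q)) - 1\bigr)}{v}
  = \frac{d\,(c_0 - 1)}{v} + d \sum_{i=1}^{n} c_i v^{i-1}.
\]
% The left-hand side is an integer for every admissible v, and so is the sum
% on the right. Hence d(c_0 - 1)/v is an integer for infinitely many v,
% which is only possible if c_0 = 1.
```

The same reasoning, applied to $v^2/d$ and $v^3/d$, isolates the coefficients of degrees one and two.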

(2): Let $X_c$ be a set of minimal representatives with $m_c = 1$, i.e. $X_c \subseteq \{a_j e_{\beta_j} \mid a_j \in K^\times\}$
for some $1 \le j \le N$. It follows from Lemma 5.3(1) and [8, Lem. 7.2] that $|X_c(q)| = v$. The
number of such sets is $N = |\Phi^+|$.

Now we consider sets of minimal representatives with $m_c > 1$. Since two distinct positive
roots are linearly independent, we can use again Lemma 5.2 (this time with $k = 2$) and argue
as in (1) that $k(U(q)) - |\Phi^+| v - 1$ is divisible by $v^2/d$ for some $d \in \mathbb{N}$.

(3): Consider a set $X_c \subseteq \{a_j e_{\beta_j} + a_k e_{\beta_k} \mid a_j, a_k \in K^\times\}$ of minimal representatives that has
two non-zero coefficients, with $j < k$. Lemma 5.3(2) implies that whether $k$ is an inert point
of $x \in X_c$ only depends on $j$ (and not on $a_j$). Thus we can consider $x := e_{\beta_j} + e_{\beta_k}$. The fact
that $k$ is a ramification point of $x$ is equivalent to there being no positive root $\alpha$ such that
$[e_{\beta_j}, e_\alpha] = c e_{\beta_k}$, with $c \ne 0$ (else $\dim \mathfrak{c}_{\mathfrak{u}}(x + \mathfrak{m}_k) = \dim \mathfrak{c}_{\mathfrak{u}}(x + \mathfrak{m}_{k-1}) - 1$). By Chevalley's
commutator formula, $\alpha = \beta_k - \beta_j$, i.e. $\beta_k - \beta_j$ must not be a root.

It follows that there are as many sets $X_c$ of minimal representatives with $m_c = 2$ as there
are tuples $(\beta_j, \beta_k)$ of positive roots with $j < k$ and $\beta_k - \beta_j \notin \Phi$. Because of Lemma 5.3(1)
and [8, Lem. 7.2], it follows that $|X_c(q)| = v^2$ for these $X_c$.

Now, let $X_c$ be a set of minimal representatives with more than two non-zero coefficients,
and let the first three of them (with respect to our ordering) belong to $e_{\beta_j}$, $e_{\beta_k}$ and $e_{\beta_l}$. We
first want to prove that $\beta_j$, $\beta_k$ and $\beta_l$ must be linearly independent. We can use again Lemma
5.3(2) and consider the sum $e_{\beta_j} + e_{\beta_k} + e_{\beta_l}$. From the fact that $j$, $k$ and $l$ are ramification
points we can deduce the following facts:

• $\beta_k - \beta_j$ is not a root. This follows as in the case of two non-zero coefficients above,
because $k$ is a ramification point.

• Not both $\beta_l - \beta_j$ and $\beta_l - \beta_k$ are roots. Otherwise, if $\beta_l - \beta_j =: \beta_m$ and $\beta_l - \beta_k =: \beta_n$,
then $l$ being a ramification point implies that there is a dependence between the
coefficients of $e_{\beta_m}$ and $e_{\beta_n}$ in $\mathfrak{c}_{\mathfrak{u}}(e_{\beta_j} + e_{\beta_k} + e_{\beta_l})$. This dependence could have only
originated from an earlier inert point $s$, and so $[e_{\beta_m}, e_{\beta_k}]$ and $[e_{\beta_n}, e_{\beta_j}]$ must be non-zero
elements from the root space of $\beta_s$. Using again Chevalley's commutator formula,
we get that $\beta_m + \beta_k = \beta_j + \beta_n$. Together with $\beta_m + \beta_j = \beta_k + \beta_n$ this leads to $\beta_j = \beta_k$,
a contradiction.

• If $\beta_l - \beta_j =: \beta_m$ is a root, then $\beta_k + \beta_m$ is also a root. Since $l$ is a ramification point
in spite of $[e_{\beta_m}, e_{\beta_j}] = c e_{\beta_l}$ for some $c \ne 0$, the centralizer $\mathfrak{c}_{\mathfrak{u}}(e_{\beta_j} + e_{\beta_k} + e_{\beta_l})$ must
consist of elements with the coefficient of $e_{\beta_m}$ being zero. This must originate from
an earlier inert point $s < l$, which means that $[e_{\beta_m}, e_{\beta_k}] = c e_{\beta_s}$ for some $c \ne 0$. The
Chevalley commutator formula yields that $\beta_m + \beta_k = \beta_s$.

• If $\beta_l - \beta_k$ is a root, then $\beta_j + \beta_l - \beta_k$ is also a root. This follows analogously to the
previous statement.

If our group G has rank one or rank two, the statement of the theorem follows from the 
respective polynomials which are known. If the rank of G is bigger than two, then the linear 
independence is a direct result of Lemma 5.1, using contraposition. 

Now we can again use Lemma 5.2 (with $k = 3$) and argue as in (1) and (2) that

\[ k(U(q)) - |\{(\beta_j, \beta_k) \in \Phi^+ \times \Phi^+ \mid j < k,\ \beta_k - \beta_j \notin \Phi\}|\, v^2 - |\Phi^+|\, v - 1 \]

is divisible by $v^3/d$ for some $d \in \mathbb{N}$. □



We summarize the idea behind the proof of the formulas in Theorem 5.4: Suppose that we
have a set $J$ of $k$ positive roots in a root system of rank $r \ge k$, where $k \le 3$. If there exists
a family $X_c$ of minimal representatives such that the nonzero coefficients of the elements of
$X_c$ correspond to the roots in $J$, then the roots in $J$ must be linearly independent. This
statement is trivial for $k = 1$ and $k = 2$, but some work is required for $k = 3$.
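The three coefficients of Theorem 5.4 are easy to evaluate for a given root system. The following minimal sketch does this for type $A_n$, assuming the usual realisation of the positive roots as $e_i - e_j$ with $i < j$; the function names are hypothetical. Since $\beta_k - \beta_j \in \Phi$ if and only if $\beta_j - \beta_k \in \Phi$, the count of pairs does not depend on the chosen ordering of the roots, so unordered pairs are counted.

```python
from itertools import combinations

def type_a_positive_roots(n):
    """Positive roots e_i - e_j (i < j) of type A_n as vectors in Z^(n+1)."""
    roots = []
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            r = [0] * (n + 1)
            r[i], r[j] = 1, -1
            roots.append(tuple(r))
    return roots

def low_degree_coefficients(pos):
    """Coefficients of v^0, v^1, v^2 as predicted by Theorem 5.4."""
    phi = set(pos) | {tuple(-x for x in r) for r in pos}
    # Count unordered pairs of positive roots whose difference is not a root.
    pairs = sum(
        1
        for a, b in combinations(pos, 2)
        if tuple(y - x for x, y in zip(a, b)) not in phi
    )
    return 1, len(pos), pairs

if __name__ == "__main__":
    print(low_degree_coefficients(type_a_positive_roots(2)))  # (1, 3, 1)
    print(low_degree_coefficients(type_a_positive_roots(3)))  # (1, 6, 7)
```

For $A_2$ this agrees with $k(U_3(q)) = q^2 + q - 1 = v^2 + 3v + 1$, and for $A_3$ with $k(U_4(q)) = 2q^3 + q^2 - 2q = 2v^3 + 7v^2 + 6v + 1$.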

Example 5.5. It is not possible to use a similar argument in order to determine a formula
for the coefficients of degrees three and higher: Let $\Phi$ be of type $C_5$ with basis $\{\alpha_1, \dots, \alpha_5\}$,
where $\alpha_5$ is long, and consider $J := \{\alpha_2,\ \alpha_5,\ \alpha_3 + \alpha_4 + \alpha_5,\ 2\alpha_3 + 2\alpha_4 + \alpha_5\}$. Then the set

\[ X_c := \{x = a_1 e_{\beta_1} + \cdots + a_N e_{\beta_N} \mid a_i \ne 0 \Leftrightarrow \beta_i \in J\} \]

is a family of minimal representatives that occurs for type $C_5$, but the roots in $J$ are not
linearly independent any more. So the aforementioned statement does not hold for $k > 3$.
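The linear dependence among the roots in $J$ can be made explicit. Reading the three roots of $J$ other than $\alpha_2$ as $\alpha_5$, $\alpha_3 + \alpha_4 + \alpha_5$ and $2\alpha_3 + 2\alpha_4 + \alpha_5$, one has the following relation:

```latex
\[
  2\alpha_3 + 2\alpha_4 + \alpha_5
  \;=\; 2\,(\alpha_3 + \alpha_4 + \alpha_5) \;-\; \alpha_5 .
\]
```

Hence the four roots of $J$ span a space of dimension three only.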